Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

fix: open file decriptors #4831

Closed
wants to merge 3 commits into from
Closed

fix: open file decriptors #4831

wants to merge 3 commits into from

Conversation

boquan-fang
Copy link
Contributor

@boquan-fang boquan-fang commented Oct 8, 2024

Resolved issues:

Partially solve #4005.

Description of changes:

  • Use s2n_io_pair_close and s2n_io_pair_close_one_end to close all opened AF_UNIX socket.
  • Clean up all of opened /dev/urandom.

Call-outs:

  • I didn't add a test in this PR. The test to detect all opened fds will be in a separate PR.

Testing:

  • Test locally.
    • Add --track-fds=yes to CTest memcheck and direct all Valgrind output to MemoryTester.log files.
    • Search in VS Code for Open File Descriptors of AF_UNIX or /dev/urandom. Nothing was captured.
    • Manually go through every log files to see if there are anything that are missed.
    • This is similar to the check in ./codebuild/bin/test_exec_leak.sh.
      # run valgrind with track-fds enabled
      valgrind_log_dir=valgrind_log_dir
      for test_file in detect_exec_leak detect_exec_leak_finish; do
      LD_LIBRARY_PATH="build/lib:$TARGET_LIBCRYPTO_PATH/lib:$LD_LIBRARY_PATH" S2N_VALGRIND=1 \
      valgrind --leak-check=full --show-leak-kinds=all --errors-for-leak-kinds=all \
      --run-libc-freeres=yes -q --gen-suppressions=all --track-fds=yes \
      --leak-resolution=high --undef-value-errors=no --trace-children=yes \
      --suppressions=tests/unit/valgrind.suppressions --log-file="build/$valgrind_log_dir/$test_file" \
      build/bin/$test_file
      # search for all leaked file descriptors, excluding the valgrind_log_dir file
      cat build/$valgrind_log_dir/$test_file | \
      grep "Open file descriptor" | \
      grep --invert-match $valgrind_log_dir \
      && fail "file leak detected while running $test_file"
      done

By submitting this pull request, I confirm that my contribution is made under the terms of the Apache 2.0 license.

Copy link
Contributor

@goatgoose goatgoose left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

There's a LOT to review here. Would you mind splitting these changes up into smaller PRs to make sure we don't miss anything while reviewing?

DEFER_CLEANUP(struct s2n_test_io_pair io_pair = { 0 }, s2n_io_pair_close);
struct s2n_test_io_pair io_pair = { 0 };
Copy link
Contributor

@goatgoose goatgoose Oct 9, 2024

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Why doesn't s2n_io_pair_close() close both the client and server file descriptors? Is it because s2n_io_pair_close() will abort if the client is already closed?

@@ -573,6 +573,8 @@ int main(int argc, char **argv)
S2N_ERR_CLOSED);
EXPECT_FALSE(s2n_connection_check_io_status(receiver, S2N_IO_READABLE));
}
/* Close the other end of pipe */
EXPECT_SUCCESS(s2n_io_pair_close_one_end(&io_pair, 1 - mode));
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

nit - S2N_PEER_MODE makes this slightly clearer:

Suggested change
EXPECT_SUCCESS(s2n_io_pair_close_one_end(&io_pair, 1 - mode));
EXPECT_SUCCESS(s2n_io_pair_close_one_end(&io_pair, S2N_PEER_MODE(mode)));

Comment on lines +454 to +455
/* Clean up with previously set functions */
POSIX_GUARD_RESULT(s2n_rand_cleanup());
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think I understand what's happening here but want to make sure: it looks like check_drgb_version sets custom rand callbacks:

/* Over-ride the entropy sources */
POSIX_GUARD(s2n_rand_set_callbacks(nist_fake_entropy_init_cleanup, nist_fake_entropy_init_cleanup, generator, generator));

But since our default rand callback was already initialized, /dev/urandom was already opened. So when we go to cleanup in END_TEST, the nist_fake_entropy_init_cleanup callback is invoked, which does nothing. So /dev/urandom remains open. And this change sets the rand callbacks back to default so that the proper cleanup function is invoked. Is that correct?

If so, I think a better fix for this would be to properly call s2n_rand_set_callbacks before s2n_init(), like the documentation describes:

s2n-tls/api/s2n.h

Lines 586 to 588 in ce0234e

* Allows the caller to override s2n-tls's entropy functions.
*
* @warning This function must be called before s2n_init().

So the proposal would be to remove the s2n_rand_set_callbacks() call in check_drgb_version(), and instead call s2n_cleanup(), s2n_rand_set_callbacks(), and s2n_init() whenever a new rand callback is needed. Otherwise, this relies on s2n_rand_cleanup() being called twice (once manually and once in s2n_cleanup) in order to properly clean up its state, which seems wrong.

Comment on lines +193 to +194
/* Clean up urandom of the child process */
POSIX_GUARD_RESULT(s2n_rand_cleanup());
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think this should probably just be calling s2n_cleanup?

Suggested change
/* Clean up urandom of the child process */
POSIX_GUARD_RESULT(s2n_rand_cleanup());
EXPECT_SUCCESS(s2n_cleanup());

Comment on lines +105 to +106
/* Child process exit don't close io_pair properly, so manually close it*/
EXPECT_SUCCESS(s2n_io_pair_close(&io_pair));
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Would this work if the io_pair was created in a scope that ends before the exit call? Something like this maybe?

            {
                DEFER_CLEANUP(struct s2n_test_io_pair io_pair = { 0 }, s2n_io_pair_close);
                EXPECT_SUCCESS(s2n_io_pair_init(&io_pair));

                uint8_t counter = 0;
                int result = 0;
                S2N_IO_RETRY_EINTR(result,
                        s2n_test_real_interrupt(io_pair.client, n_times, &counter));
                EXPECT_EQUAL(result, S2N_TEST_SUCCESS);
                EXPECT_EQUAL(counter, n_times);
            }
            exit(0);

@@ -208,6 +208,7 @@ static S2N_RESULT s2n_run_self_talk_test(s2n_test_scenario scenario_fn)
EXPECT_SUCCESS(s2n_connection_free(client));
EXPECT_SUCCESS(s2n_cert_chain_and_key_free(chain_and_key));
EXPECT_SUCCESS(s2n_config_free(config));
EXPECT_SUCCESS(s2n_io_pair_close_one_end(&io_pair, S2N_CLIENT));
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Should there still be a DEFER_CLEANUP on this io_pair?

@boquan-fang
Copy link
Contributor Author

I will divide this PR into three different smaller PRs. This PR will be closed.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
Projects
None yet
Development

Successfully merging this pull request may close these issues.

2 participants